Abstract. We quantify the density of rational points in the unit sphere
$S^n$, proving analogues of the classical theorems on the embedding of $\mathbb{Q}^n$
into $\mathbb{R}^n$. Specifically, we prove a Dirichlet theorem stating that every point
$\alpha \in S^n$ is sufficiently approximable, the optimality of this approximation
via the existence of badly approximable points, and a Khintchine theorem
showing that the Lebesgue measure of approximable points is either zero
or full depending on the convergence or divergence of a certain sum. These
results complement and improve on previous results, particularly recent
theorems of Ghosh, Gorodnik and Nevo.



1. Introduction

1.1. Motivation. The field of Diophantine approximation seeks to quantify
the density of a subset $A$ in a metric space $X$. Classical examples include
the density of $\mathbb{Q}$ in $\mathbb{R}$, or of a number field $K$ in its completion. One can also
study the density of rational points in certain subsets $X$ of $\mathbb{R}^m$, specifically
level sets of rational quadratic forms on $\mathbb{R}^m$. In this paper we analyze the case
of spheres $S^n$, deferring the general case (of quadric hypersurfaces in $\mathbb{R}^{n+1}$)
to a forthcoming work [10]. Rational points on the sphere can be represented
as $\frac{\mathbf{p}}{q}$ with $q \in \mathbb{N}$ and $\mathbf{p} \in \mathbb{Z}^{n+1}$ primitive. We want to measure the distance
between $\alpha \in S^n$ and such a point against its complexity $q$. Unless otherwise
specified, we will use the supremum norm $\|\cdot\|$ on $\mathbb{R}^{n+1}$ to measure distance.

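It will be convenient to introduce the following general definition: for a subset
$X$ of $\mathbb{R}^m$ and a function $\varphi : \mathbb{N} \to (0, \infty)$, say that $\alpha \in X$ is $\varphi$-approximable in
$X$ if there exist infinitely many $(\mathbf{p}, q) \in \mathbb{Z}^{m+1}$ with $\frac{\mathbf{p}}{q} \in X$ such that

\[ \left\| \alpha - \frac{\mathbf{p}}{q} \right\| < \varphi(q); \tag{1.1} \]

the set of points $\varphi$-approximable in $X$ will be denoted by $A(\varphi, X)$. Note that
rational points are $\varphi$-approximable in $X$ for any positive function $\varphi$, and if $\alpha$
is irrational then 'infinitely many $(\mathbf{p}, q) \in \mathbb{Z}^{n+2}$ with $\frac{\mathbf{p}}{q} \in X$' can be replaced
by 'infinitely many $\frac{\mathbf{p}}{q} \in \mathbb{Q}^{n+1} \cap X$'.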

The requirement $\frac{\mathbf{p}}{q} \in X$ distinguishes the above set-up from the one usually
considered in Diophantine approximation on manifolds - there one studies
rates of approximation of points on a manifold $X$ by rational points which do
not have to lie in $X$. In other words, in this paper we are studying intrinsic
Diophantine approximation on manifolds, as opposed to the existing very rich
theory of approximation by rational points of the ambient space, see e.g. [1, 2, 15].

The classical case $X = \mathbb{R}^m$ can be considered as a motivation. With the
notation

\[ \varphi_\tau(x) = x^{-\tau}, \]

we have the following basic facts, see [21]:

• Dirichlet's Theorem on simultaneous Diophantine approximation implies
that any $\mathbf{x} \in \mathbb{R}^m$ is $\varphi_{1+1/m}$-approximable.

• For sufficiently small $c > 0$ the complement of $A(c\varphi_{1+1/m}, \mathbb{R}^m)$ is nonempty;
in fact, the union of complements to $A(c\varphi_{1+1/m}, \mathbb{R}^m)$, called the
set of badly approximable vectors, has full Hausdorff dimension.

• Khintchine's Theorem asserts that, when $x \mapsto x\varphi(x)$ is non-increasing¹,
almost every (resp. almost no) $\mathbf{x} \in \mathbb{R}^m$ is $\varphi$-approximable assuming the
sum

\[ \sum_{k=1}^{\infty} k^m \varphi(k)^m \]

diverges (resp. converges).

• By a theorem of Jarník, the set of $\varphi_\tau$-approximable points, where $\tau$ is
at least $1 + 1/m$, has Hausdorff dimension $\frac{m+1}{\tau}$.

In this paper we will put $m = n + 1$ and study the case $X = S^n$, the
Euclidean unit sphere in $\mathbb{R}^{n+1}$. The question of intrinsic approximation on
spheres has been studied in the literature implicitly by Dickinson-Dodson [8]
and Drutu [9]², and explicitly by Schmutz [22] and Ghosh-Gorodnik-Nevo
[11, 12] (we note that in the two latter papers the generality is much wider,
the subject being $S$-rational points on homogeneous varieties). In the present
paper we use a connection between Diophantine approximation on spheres and
dynamics/geometry of the quotient of $G = \mathrm{SO}(n + 1, 1)$ by the group $\Gamma$ of its
integer points and deduce intrinsic analogues of a number of basic results in
Diophantine approximation, strengthening what has been known before.

1.2. Statement of results. Our main theorems follow. The first one
gives an analogue of Dirichlet's theorem (see also Theorem 4.1 for a stronger
statement):

Theorem 1.1. There exists a constant $C > 1$ such that every $\alpha \in S^n$ is
$C\varphi_1$-approximable in $S^n$.

Note that in [22] a version of the above theorem was established with $\varphi_1$
replaced by $\varphi_{1/(2\lceil \log_2(n+1) \rceil)}$ and with an explicit dependence of $C$ on $n$. Later
Ghosh, Gorodnik and Nevo [11] did the same with $\varphi_1$ replaced by $\varphi_{1/4 - \varepsilon}$ for all
$n$ and with $C$ dependent on $\alpha$ and $\varepsilon$ (see §4.1 for a more precise statement of
their results).



1 In fact, the monotonicity assumption is not needed if $m > 1$.

2 Both [8] and [9] study approximations by rationals in $\mathbb{R}^{n+1}$, but use the algebraic nature
of $S^n$ to reduce their problems to intrinsic approximation.
We also show that $C$ in the above theorem cannot be replaced by an arbitrarily
small constant, by considering the set of $\alpha \in S^n$ which are badly
approximable in $S^n$, that is, not $c\varphi_1$-approximable in $S^n$ for some $c > 0$:

\[ \mathrm{BA}(S^n) := \left\{ \alpha \in S^n : \exists\, c = c(\alpha) \text{ such that } \forall\, \tfrac{\mathbf{p}}{q}, \ \left\| \alpha - \frac{\mathbf{p}}{q} \right\| > \frac{c}{q} \right\}. \]

Analogously to Dani's result [6] on the correspondence between simultaneous
Diophantine approximation and homogeneous actions, we show that $\alpha \in
\mathrm{BA}(S^n)$ if and only if a certain trajectory on $G/\Gamma$ is bounded. Then, using
[7], we establish

Theorem 1.2. The set $\mathrm{BA}(S^n)$ is thick.

Here and hereafter we say that a subset of a metric space $X$ is thick if its
intersection with any nonempty open subset of $X$ has full Hausdorff dimension.

A correspondence with dynamics also helps us to derive an analogue of
Khintchine's Theorem, from which, in particular, it follows that $\mathrm{BA}(S^n)$ has
Lebesgue measure zero. First note that $A(\varphi, X)$ is the limsup set of the
family of balls

\[ \left\{ B\!\left( \tfrac{\mathbf{p}}{q}, \varphi(q) \right) : \tfrac{\mathbf{p}}{q} \in S^n \cap \mathbb{Q}^{n+1} \right\}. \]

Since up to a constant the Lebesgue measure of $B(\frac{\mathbf{p}}{q}, \varphi(q))$ is $\varphi(q)^n$, it is
a consequence of the Borel-Cantelli Lemma that if the sum $\sum_{\frac{\mathbf{p}}{q} \in S^n \cap \mathbb{Q}^{n+1}} \varphi(q)^n$
converges, then the Lebesgue measure of $A(\varphi, S^n)$ is zero. Furthermore, it
follows from [13] that

\[ \# \left\{ \tfrac{\mathbf{p}}{q} \in S^n \cap \mathbb{Q}^{n+1} : q \le N \right\} \ll N^n \]

for all $N > 0$ (here and hereafter $\ll$ means that the left hand side is bounded
from above by the right hand side times a constant possibly dependent on $n$).
We refer the reader to [5] for a nice introduction on counting rational points on
varieties, and to [9] where counting results are derived from equidistribution
of translates of horocycles. Given the above estimate, one can deduce the
following convergence-type statement for a non-increasing function $\varphi$:

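\[
\begin{aligned}
\sum_{\frac{\mathbf{p}}{q} \in S^n \cap \mathbb{Q}^{n+1}} \varphi(q)^n
&= \sum_{\ell} \ \sum_{\frac{\mathbf{p}}{q} \in S^n \cap \mathbb{Q}^{n+1},\ q \in (2^\ell, 2^{\ell+1}]} \varphi(q)^n \\
&\le \sum_{\ell} \# \left\{ \tfrac{\mathbf{p}}{q} \in S^n \cap \mathbb{Q}^{n+1} : q \in (2^\ell, 2^{\ell+1}] \right\} \varphi(2^\ell)^n \\
&\ll \sum_{\ell} (2^{\ell+1})^n \varphi(2^\ell)^n \ll \int^{\infty} (2^\ell)^n \varphi(2^\ell)^n \, d\ell \\
&\ll \int^{\infty} (2^\ell)^{n-1} \varphi(2^\ell)^n \, d(2^\ell) \ll \sum_{N} N^{n-1} \varphi(N)^n .
\end{aligned}
\]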
Therefore, if the series

\[ \sum_{k=1}^{\infty} k^{n-1} \varphi(k)^n \tag{1.2} \]

converges, it follows that almost no point $\alpha \in S^n$ is $\varphi$-approximable.
The following theorem furnishes the converse result:

Theorem 1.3. For any $\varphi : \mathbb{N} \to (0, \infty)$ such that the function $k \mapsto k\varphi(k)$ is
non-increasing, the Lebesgue measure of $A(\varphi, S^n)$ is full (resp. zero) if and
only if the sum (1.2) diverges (resp. converges).

We point out that Ghosh, Gorodnik, and Nevo [12] have recently proven
various Khintchine-type results for intrinsic approximation on homogeneous
varieties. In particular, they show that if for some $a > c \cdot n$ (where $c$ is
an explicitly computable constant $> 2$),

\[ \sum_{\frac{\mathbf{p}}{q} \in S^n \cap \mathbb{Q}^{n+1}} \varphi(q)^a = \infty, \]

then the Lebesgue measure of $A(\varphi, S^n)$ is full. Although, as noted previously,
the results of [12] are more general, for approximations by rational points our 
result is much stronger, providing an exact converse to the convergence case 
above. 

The correspondence that is instrumental in deriving all the aforementioned
results is not new; it was implicitly used by Drutu [9] to compute the Hausdorff
dimension of sets $A(\varphi_\tau, S^n)$ for $\tau > 1$ (an analogue of Jarník's theorem; for
$n = 1$ it was done previously by Dickinson and Dodson [8]). However, to the
best of our knowledge, it has never been stated explicitly before.

Let us now describe this correspondence and introduce the main ideas behind
our proofs. Let $Q : \mathbb{R}^{n+2} \to \mathbb{R}$ be the quadratic form given by

\[ Q(\mathbf{x}) = \sum_{i=1}^{n+1} x_i^2 - x_{n+2}^2 . \tag{1.3} \]

Then one can embed $S^n$ into the lightcone

\[ L := \{ \mathbf{x} \in \mathbb{R}^{n+2} : Q(\mathbf{x}) = 0 \} \]

of $Q$ via $\alpha \mapsto (\alpha, 1)$. Under this embedding, each rational point $\frac{\mathbf{p}}{q} \in S^n$
determines a line in $L$ and a unique primitive vector $(\mathbf{p}, q) \in \mathbb{Z}^{n+1} \times \mathbb{N}$ lying
on this line. Because changing heights scales distances in $L$ linearly, good
approximants $\frac{\mathbf{p}}{q}$ to $\alpha \in S^n$ correspond to lattice points $(\mathbf{p}, q) \in \mathbb{Z}^{n+2} \cap L$
which are close to the line through $(\alpha, 1)$. Note that we have changed our
approximating points from a dense subset to a discrete one, which dynamics
is better equipped to handle. Let $\Lambda_0 := \mathbb{Z}^{n+2} \cap L$.

Denote by $G$ the group $\mathrm{SO}(Q)$ of orientation-preserving linear transformations
which preserve $Q$. Let $r_\alpha \in G$ denote an element which preserves
$\mathbb{R}^{n+1} \times \{0\}$ and sends $(\alpha, 1)$ to $(1, 0, \dots, 0, 1) \in L$ - such an element is not



3 Note that the monotonicity condition in this theorem is stronger than in the preceding
discussion of the convergence case; it is not clear whether or not it can be relaxed.
unique for $n > 1$, see §2 for more details on the choice of $r_\alpha$. Applying $r_\alpha$ to the
lightcone $L$, we see that good approximants $(\mathbf{p}, q) \in \Lambda_0$ become points in $r_\alpha \Lambda_0$
which are close to the line through $(1, 0, \dots, 0, 1)$. Let $g_t \in G$ be a flow which
contracts this line exponentially and expands the line through $(-1, 0, \dots, 0, 1)$
exponentially (see §2.1 for an explicit description of $g_t$). Then points in $r_\alpha \Lambda_0$
close to this line correspond to small vectors in the lattice $g_t r_\alpha \Lambda_0$ for some
$t > 0$. This is the central idea of the paper, and the precise quantitative
nature of this correspondence is the subject of Lemmas 2.3 and 2.4 below.

By a lattice in $L$ we will mean a set of the form $g\Lambda_0$ for some $g \in G$.
Because the stabilizer of $\Lambda_0$ in $G$ coincides with the group $\Gamma := \mathrm{SO}(Q)_{\mathbb{Z}}$ of
integer points of $G$, the space of lattices in $L$ can be identified with $\mathcal{L} := G/\Gamma$,
a homogeneous space with finite $G$-invariant (Haar) measure. In what follows
we will denote by $\mu$ the probability Haar measure on $\mathcal{L}$. Also let us define a
function $\omega$ on $\mathcal{L}$ by

\[ \omega(\Lambda) := \min_{\mathbf{0} \ne \mathbf{v} \in \Lambda} \|\mathbf{v}\| . \]

The proofs of Theorems 1.1, 1.2 and 1.3 will be based on the following
theorem, which is a partial analogue of Theorem 8.5 in [16].

Theorem 1.4. Let $\varphi : [x_0, \infty) \to (0, \infty)$ be a piecewise $C^1$ function such that
the function $x \mapsto x\varphi(x)$ is non-increasing. Put

\[ t_0 = \ln\!\left( \frac{2}{\sqrt{n+1}\,\varphi(x_0)} \right), \tag{1.4} \]

and define a function $\rho : [t_0, \infty) \to (0, \infty)$ by

\[ \rho(t) = e^{-t} \cdot \varphi^{-1}\!\left( \frac{2}{\sqrt{n+1}\, e^{t}} \right). \tag{1.5} \]

Then $\rho$ is non-increasing and the following hold:

• If $\alpha \in A(\varphi, S^n)$, then there exists a sequence $t_k \to \infty$ such that
$\omega(g_{t_k} r_\alpha \Lambda_0) < 2\rho(t_k)$;

• If there exists a sequence $t_k \to \infty$ such that $\omega(g_{t_k} r_\alpha \Lambda_0) < \rho(t_k)$, then
$\alpha \in A(\sqrt{n+1}\,\varphi, S^n)$.

In other words, up to a constant, $\alpha$ is $\varphi$-approximable if and only if the orbit
$g_t r_\alpha \Lambda_0$ hits the 'shrinking target' parametrized by $\rho(t)$ infinitely often.

1.3. Outline of the paper. In §2 we analyze the quantitative nature of the
correspondence between good approximants $\frac{\mathbf{p}}{q}$ to $\alpha$ and lattices $g_t r_\alpha \Lambda_0$ with
small vectors. This analysis culminates in the proof of Theorem 1.4, which
allows us to change our perspective from approximations on $S^n$ to properties
of trajectories on $\mathcal{L}$. In §3 we study the geometry of the space $\mathcal{L}$ by means
of reduction theory, and prove a version of Mahler's compactness criterion
(Corollary 3.4), thus establishing that small values of the function $\omega$ correspond
to complements of large compact subsets of $\mathcal{L}$.

Then in §4 we prove our main results. In §4.1 we combine the correspondence
of §2 with Mahler's criterion to prove Theorem 1.1, now reduced to a statement
about lattices in $L$. In fact we prove a stronger statement, Theorem 4.1, which
establishes the so called uniform $(C, \frac12, \frac12)$-Dirichlet property of every $\alpha \in S^n$.
In §4.2 we derive Theorem 1.2 from Dani's result on bounded geodesics on finite
volume hyperbolic manifolds. Finally in §4.3 we recall the framework set forth
by Kleinbock and Margulis in [16] to establish a Borel-Cantelli lemma about
cuspidal penetrations. We conclude that the set of lattices whose trajectories
penetrate a sequence of shrinking cuspidal neighborhoods infinitely often is
either null or full depending on the convergence or divergence of the sum of
measures of these neighborhoods. We then estimate these measures and, using
the correspondence defined in Theorem 1.4, relate their sum to the convergence
or divergence of (1.2).

1.4. Acknowledgements. The authors are grateful to Cornelia Drutu, Lior 
Fishman, Alex Gorodnik and David Simmons for helpful discussions. The work 
of the first named author was supported in part by NSF grant DMS-1101320. 

2. Good approximations and small vectors 

Let $\{\mathbf{u}_i\}$ denote the standard basis of $\mathbb{R}^{n+2}$, with respect to which $Q$ has
the familiar form (1.3). We will refer to the group of orientation-preserving
linear transformations preserving $Q$ as $\mathrm{SO}(Q)$, and denote it by $G$.

Let $\mathbf{e}_1$ denote the vector $\mathbf{u}_1 + \mathbf{u}_{n+2}$ and let $K = \mathrm{SO}(n + 1)$ be the subgroup
of $G$ preserving $\mathrm{Span}(\mathbf{u}_1, \dots, \mathbf{u}_{n+1})$. For $\alpha \in S^n \subset \mathrm{Span}(\mathbf{u}_1, \dots, \mathbf{u}_{n+1})$, we
would like to choose an element $r_\alpha \in K$ such that $r_\alpha \alpha = \mathbf{u}_1$, or, equivalently,
$r_\alpha(\alpha, 1) = \mathbf{e}_1$. As mentioned previously, for $n > 1$ such an element is not
unique. However, if we map $K$ to $S^n$ via $g \mapsto g(\mathbf{u}_1)$, then the stabilizer of
$\mathbf{u}_1$ in $K$ is isomorphic to $\mathrm{SO}(n)$, identified with the lower right $n \times n$ block of
$\mathrm{SO}(n + 1)$. Therefore there is a unique coset $r_\alpha^{-1}\,\mathrm{SO}(n)$ with the property that
$g\mathbf{u}_1 = \alpha$ for any $g \in r_\alpha^{-1}\,\mathrm{SO}(n)$. Our first goal is to choose a particular section

\[ S^n \cong \mathrm{SO}(n + 1)/\mathrm{SO}(n) \to K . \]

Note that without loss of generality we can restrict our attention to an open
neighborhood $W$ of the hemisphere of $S^n$ centered at $\mathbf{u}_1$, since the union of $W$
and its image under reflection covers $S^n$, and all the Diophantine properties
we consider are invariant under reflection.



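Let

\[ g_t := \begin{pmatrix} \cosh(t) & & -\sinh(t) \\ & I_n & \\ -\sinh(t) & & \cosh(t) \end{pmatrix} \tag{2.1} \]

and let

\[ A := \{ g_t : t \in \mathbb{R} \} . \]

Then one easily checks:

\[ g_t \mathbf{e}_1 = \begin{pmatrix} \cosh(t) & & -\sinh(t) \\ & I_n & \\ -\sinh(t) & & \cosh(t) \end{pmatrix} \begin{pmatrix} 1 \\ \mathbf{0} \\ 1 \end{pmatrix} = e^{-t} \mathbf{e}_1 . \]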

Let us also define the horospherical subgroups associated to $\{g_t\}$. These
subgroups capture the dynamically significant behavior of the $g_t$-action. Namely:

• the contracting subgroup $U := \{ h \in G : g_t h g_{-t} \to e \text{ as } t \to \infty \}$;

• the neutral subgroup $H^0 := \{ h \in G : g_t h = h g_t \text{ for all } t \}$;

• the expanding subgroup $H := \{ h \in G : g_{-t} h g_t \to e \text{ as } t \to \infty \}$.

One knows that $G$ is locally a product of $U$, $H^0$ and $H$ (that is, the Lie
algebra of $G$ is the direct sum of the Lie algebras of these three subgroups).
Additionally, we recall the Iwasawa decomposition of $G$:

Theorem 2.1. The mapping $U \times A \times K \to G$ is a diffeomorphism.

The next lemma constructs a section $W \to K$ mentioned above:

Lemma 2.2. There exist two bi-Lipschitz maps $W \to K$ and $W \to H$ which
we will denote by $\alpha \mapsto r_\alpha$ and $\alpha \mapsto h_\alpha$, where $W \subset S^n$ is a neighborhood of
the hemisphere containing $\mathbf{u}_1$, such that for any $\alpha \in W$ one has

\[ r_\alpha(\alpha, 1) = \mathbf{e}_1 \tag{2.2} \]

and

\[ h_\alpha r_\alpha^{-1} \in U H^0 . \tag{2.3} \]

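Proof. To prove the lemma we first need to better understand the structure of
$H$, the subgroup of $G$ whose Lie algebra is given by

\[ \left\{ \begin{pmatrix} 0 & -\mathbf{x}^T & 0 \\ \mathbf{x} & 0 & \mathbf{x} \\ 0 & \mathbf{x}^T & 0 \end{pmatrix} : \mathbf{x} \in \mathbb{R}^n \right\} . \tag{2.4} \]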



By Theorem 2.1, every element $h \in H$ can be uniquely represented as

\[ h = u g_s k , \tag{2.5} \]

where $u \in U$, $s \in \mathbb{R}$, $k \in K$. Let $\sigma : H \to K$ be the projection onto $K$, i.e.
$\sigma(h) = k$, where $h$ and $k$ are as in (2.5). This mapping is injective: if we have
two elements $h = u g_s k$ and $h' = u' g_t k$ for which $k = \sigma(h) = \sigma(h')$, then

\[ H \ni h' \cdot h^{-1} = u' g_t k \cdot k^{-1} g_s^{-1} u^{-1} = u'' g_{t-s} \in UA \subset UH^0 . \]

Since $H \cap UH^0$ is trivial, we have $s = t$ and $u = u'$. Clearly $\sigma$ is locally
bi-Lipschitz.

One readily checks that $U\mathbf{e}_1 = \mathbf{e}_1$ (indeed, a change of coordinates identifies
$U$ as a subgroup of upper triangular matrices), and, as mentioned previously,
$g_s \mathbf{e}_1 = e^{-s} \mathbf{e}_1$. Therefore, with $\sigma(h) = k$ one has

\[ h^{-1} \mathbf{e}_1 = k^{-1} g_s^{-1} u^{-1} \mathbf{e}_1 = k^{-1} g_s^{-1} \mathbf{e}_1 = e^{s} k^{-1} \mathbf{e}_1 . \]

Since an element $h \in H$ is uniquely determined by the image of $\mathbf{e}_1$, it follows
that the mapping

\[ H \to H\mathbf{e}_1 , \quad h \mapsto h^{-1} \mathbf{e}_1 \]

is locally bi-Lipschitz. In this way we can view $H$ as an $n$-dimensional
submanifold of the light cone $L$. Explicitly, using (2.4) one can parametrize this
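embedding as

\[ H\mathbf{e}_1 = \left\{ \mathbf{v}_{\mathbf{x}} := \begin{pmatrix} 1 - \|\mathbf{x}\|^2 \\ 2\mathbf{x} \\ 1 + \|\mathbf{x}\|^2 \end{pmatrix} : \mathbf{x} \in \mathbb{R}^n \right\} , \]

where $\|\mathbf{x}\|^2 = x_1^2 + \dots + x_n^2$ (the Euclidean norm is needed here for $\mathbf{v}_{\mathbf{x}}$ to lie on $L$).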

Since we have an embedded copy of $S^n \subset L$ given by points whose last coordinate
is 1, we obtain a map $\pi : H\mathbf{e}_1 \to S^n$ given by linearly scaling $\mathbf{v}_{\mathbf{x}}$ by
$1/(1 + \|\mathbf{x}\|^2)$. This map is locally bi-Lipschitz.
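The map $\pi$ is an inverse stereographic projection, and it sends rational parameters to rational points of the sphere. The sketch below uses exact rational arithmetic to illustrate this. It assumes the parametrization $\mathbf{v}_{\mathbf{x}} = (1 - |\mathbf{x}|^2, 2\mathbf{x}, 1 + |\mathbf{x}|^2)$ with $|\mathbf{x}|$ the Euclidean norm (the sign convention for the middle block is an assumption); the function names `v`, `Q` and `project` are hypothetical helpers.

```python
from fractions import Fraction

def v(x):
    """Assumed parametrization of H e_1: (1 - |x|^2, 2x, 1 + |x|^2)."""
    s = sum(c * c for c in x)               # squared Euclidean norm of x
    return [1 - s] + [2 * c for c in x] + [1 + s]

def Q(vec):
    """Quadratic form (1.3): x_1^2 + ... + x_{n+1}^2 - x_{n+2}^2."""
    return sum(c * c for c in vec[:-1]) - vec[-1] ** 2

def project(vec):
    """Rescale a light-cone vector to last coordinate 1 and drop that coordinate."""
    return [c / vec[-1] for c in vec[:-1]]

x = [Fraction(1, 2), Fraction(2, 3)]        # a rational parameter in R^2 (n = 2)
vx = v(x)                                   # lies on the light cone: Q(vx) == 0
alpha = project(vx)                         # the rational point (11/61, 36/61, 48/61) of S^2
```

Here $11^2 + 36^2 + 48^2 = 61^2$, so `alpha` is the rational point $\frac{1}{61}(11, 36, 48)$ of $S^2$, with primitive lattice vector $(11, 36, 48, 61)$ on $L$.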

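We can now define the desired maps. Given $\alpha \in W$, there exists a unique
$\mathbf{x} \in \mathbb{R}^n$ such that $\pi(\mathbf{v}_{\mathbf{x}}) = (\alpha, 1) \in S^n$. Now define $h_\alpha \in H$ such that
$h_\alpha^{-1} \mathbf{e}_1 = \mathbf{v}_{\mathbf{x}}$, and let $r_\alpha := \sigma(h_\alpha)$.

Since both maps are bi-Lipschitz on $W$, it remains to show (2.2) and (2.3).
Note that

\[ r_\alpha^{-1} \mathbf{e}_1 = \pi(h_\alpha^{-1} \mathbf{e}_1) = \pi(\mathbf{v}_{\mathbf{x}}) = (\alpha, 1) \]

as needed; and (2.3) follows because, in view of (2.5), $h_\alpha r_\alpha^{-1} \in UA \subset UH^0$. □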

As mentioned in the introduction, the key idea behind our proofs is to restate
the problem of approximating $\alpha \in S^n$ as a problem of approximating the line
through $\mathbf{e}_1$ by the lattice $r_\alpha \Lambda_0 \in \mathcal{L}$. Applying the flow $g_t$ contracts $\mathbf{e}_1$, and
by continuity, good approximants to this line will correspond, for some time
$t > 0$, to short vectors. We now quantify that relationship.

Our results will be stated with respect to the sup norm, but due to the
definition of $L$, it is often more convenient to work with the Euclidean norm
on $\mathbb{R}^{n+2}$, denoted by $\|\cdot\|_e$. By the equivalence of norms on $\mathbb{R}^m$, this changes
the estimate only by a universal constant. Explicitly,

\[ \|\mathbf{x}\| \le \|\mathbf{x}\|_e \le \sqrt{m}\, \|\mathbf{x}\| \quad \text{for any } \mathbf{x} \in \mathbb{R}^m . \]

It is worth noting, however, that for points on $L$ we have a better approximation.
If $\mathbf{x} \in L$, then by definition

\[ Q(\mathbf{x}) = \sum_{i=1}^{n+1} x_i^2 - x_{n+2}^2 = 0 . \]

From this it immediately follows that

\[ \|\mathbf{x}\| = |x_{n+2}| \quad \text{and} \quad \|\mathbf{x}\|_e = \sqrt{2}\, |x_{n+2}| . \]

We will use these estimates frequently in what follows.

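Accordingly, since we are interested in the norms of vectors in $L$ under $g_t$
we compute

\[ \|g_t \mathbf{x}\| = |(g_t \mathbf{x})_{n+2}| = \frac{1}{2} \left| e^{t} (x_{n+2} - x_1) + e^{-t} (x_{n+2} + x_1) \right| . \tag{2.6} \]

Finally, we will have frequent need for an estimate on the term $\|q \cdot \mathbf{e}_1 - r_\alpha(\mathbf{p}, q)\|_e$.
By definition,

\[ \|q \cdot \mathbf{e}_1 - r_\alpha(\mathbf{p}, q)\|_e^2 = (q - r_\alpha(\mathbf{p}, q)_1)^2 + \sum_{i=2}^{n+1} r_\alpha(\mathbf{p}, q)_i^2 . \]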
Since $r_\alpha(\mathbf{p}, q) \in L$, we have that $q^2 = \sum_{i=1}^{n+1} r_\alpha(\mathbf{p}, q)_i^2$. Combining these
estimates shows that

\[ \|q \cdot \mathbf{e}_1 - r_\alpha(\mathbf{p}, q)\|_e = \sqrt{2q\,(q - r_\alpha(\mathbf{p}, q)_1)} . \]

We can now justify our remarks that good approximants correspond to small
vectors.



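Lemma 2.3. Let $N \ge q$ be such that

\[ \left\| \alpha - \frac{\mathbf{p}}{q} \right\| < \frac{\varepsilon}{\sqrt{qN}} . \]

Then there exists $t > 0$ such that

\[ \|g_t r_\alpha(\mathbf{p}, q)\| < \varepsilon \sqrt{n+1}\, \sqrt{\frac{q}{N}} \le \varepsilon \sqrt{n+1} . \]

Proof. By our computations above, we have that because $\left\| \alpha - \frac{\mathbf{p}}{q} \right\| < \frac{\varepsilon}{\sqrt{qN}}$,
the same is true of the Euclidean norm up to a factor, namely

\[ \left\| \alpha - \frac{\mathbf{p}}{q} \right\|_e < \sqrt{n+1}\, \frac{\varepsilon}{\sqrt{qN}} . \]

Multiplying both sides by $q$ and noting that $r_\alpha$ is a Euclidean isometry, we
have

\[ \|q \mathbf{e}_1 - r_\alpha(\mathbf{p}, q)\|_e < \varepsilon \sqrt{n+1}\, \sqrt{\frac{q}{N}} . \]

Now observe that if $\alpha - \frac{\mathbf{p}}{q} = 0$, then $g_t r_\alpha(\mathbf{p}, q) = g_t q \mathbf{e}_1 \to 0$ as $t \to \infty$, so
the conclusion of the lemma holds trivially. Otherwise, let $t_*$ be the unique
point in time when the distance from $g_t r_\alpha(\mathbf{p}, q)$ to the origin is minimized -
explicitly, this occurs at

\[ t_* = \frac{1}{2} \ln\!\left( \frac{q + r_\alpha(\mathbf{p}, q)_1}{q - r_\alpha(\mathbf{p}, q)_1} \right) . \tag{2.7} \]

For $t = t_*$ we compute

\[
\begin{aligned}
\|g_{t_*} r_\alpha(\mathbf{p}, q)\| &= |(g_{t_*} r_\alpha(\mathbf{p}, q))_{n+2}| = \sqrt{q^2 - r_\alpha(\mathbf{p}, q)_1^2} \\
&\le \sqrt{2q\,(q - r_\alpha(\mathbf{p}, q)_1)} = \|q \mathbf{e}_1 - r_\alpha(\mathbf{p}, q)\|_e \\
&< \varepsilon \sqrt{n+1}\, \sqrt{\frac{q}{N}} \le \varepsilon \sqrt{n+1} .
\end{aligned}
\]

□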
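The minimization behind (2.7) is a one-variable computation: by (2.6), the norm of $g_t r_\alpha(\mathbf{p}, q)$ is $\frac12(e^t a + e^{-t} b)$ with $a = q - r_\alpha(\mathbf{p}, q)_1$ and $b = q + r_\alpha(\mathbf{p}, q)_1$, which is minimized at $e^{2t} = b/a$ with minimum value $\sqrt{ab}$. The following sketch checks this numerically; the values of `q` and `r1` are made up for illustration and `last_coordinate` is a hypothetical helper name.

```python
import math

def last_coordinate(t, q, r1):
    """Formula (2.6) for a light-cone vector with x_{n+2} = q and x_1 = r1."""
    return 0.5 * (math.exp(t) * (q - r1) + math.exp(-t) * (q + r1))

q, r1 = 65.0, 63.0                                   # illustrative values with 0 < r1 < q
t_star = 0.5 * math.log((q + r1) / (q - r1))         # formula (2.7)
minimum = last_coordinate(t_star, q, r1)             # equals sqrt(q^2 - r1^2) = 16

# Sample the norm on a grid around t_star to confirm that t_star is the minimizer.
grid_min = min(last_coordinate(t_star + k / 100.0, q, r1) for k in range(-300, 301))
upper_bound = math.sqrt(2 * q * (q - r1))            # the bound used in the proof
```

With these values the minimum is $\sqrt{65^2 - 63^2} = 16$, which is indeed below $\sqrt{2q(q - r_1)} = \sqrt{260}$.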



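Lemma 2.4. If for some $t > 0$, $\|g_t r_\alpha(\mathbf{p}, q)\| < \delta$, then there exists an $N > q$
such that

\[ \left\| \alpha - \frac{\mathbf{p}}{q} \right\| < \frac{2\delta}{\sqrt{qN}} . \]

Proof. If we set $N = e^t \delta$, then we must have $q < N$ (this comes from comparing
the norms of $g_t r_\alpha(\mathbf{p}, q)$ and $g_t q \mathbf{e}_1$). By the chain of inequalities

\[ \left\| \alpha - \frac{\mathbf{p}}{q} \right\| \le \left\| \alpha - \frac{\mathbf{p}}{q} \right\|_e = \frac{1}{q} \|q \mathbf{e}_1 - r_\alpha(\mathbf{p}, q)\|_e = \frac{1}{q} \sqrt{2q\,(q - r_\alpha(\mathbf{p}, q)_1)} \]

it suffices to estimate the term $q - r_\alpha(\mathbf{p}, q)_1$. But by (2.6),

\[ \frac{1}{2} e^{t} (q - r_\alpha(\mathbf{p}, q)_1) \le |(g_t r_\alpha(\mathbf{p}, q))_{n+2}| < \delta , \]

from which it immediately follows that

\[ q - r_\alpha(\mathbf{p}, q)_1 < 2 e^{-t} \delta = \frac{2\delta^2}{N} . \]

Plugging this estimate back into the above, we obtain

\[ \left\| \alpha - \frac{\mathbf{p}}{q} \right\| \le \frac{1}{q} \sqrt{2q\,(q - r_\alpha(\mathbf{p}, q)_1)} < \frac{1}{q} \sqrt{2q \cdot \frac{2\delta^2}{N}} = \frac{2\delta}{\sqrt{qN}} , \]

as needed. □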



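Given the above results, we have that a specific approximant $\frac{\mathbf{p}}{q}$ satisfying
(1.1) corresponds to a time $t_*$ when

\[ \omega(g_{t_*} r_\alpha \Lambda_0) < \sqrt{n+1}\, q \varphi(q) , \]

and conversely if $\|g_t r_\alpha(\mathbf{p}, q)\| < q\varphi(q)$, then $\left\| \alpha - \frac{\mathbf{p}}{q} \right\| < 2\varphi(q)$. Moreover, if
$\alpha \notin \mathbb{Q}^{n+1}$, then $r_\alpha \Lambda_0 \cap \langle \mathbf{e}_1 \rangle = \{\mathbf{0}\}$. The significance of this trivial observation is
that, whenever $\alpha$ is irrational, for every element $(\mathbf{p}, q) \in \Lambda_0$ one has

\[ \|g_t r_\alpha(\mathbf{p}, q)\| \to \infty \quad \text{as } t \to \infty . \]

In particular if $\alpha \notin \mathbb{Q}^{n+1}$ and $\varphi$ is decreasing, then any given approximant $\frac{\mathbf{p}}{q}$
only works for a bounded length of time.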

It therefore seems reasonable to try and define a non-increasing function
$\rho(t)$ with the property that

\[ \rho(t) = q \cdot \varphi(q) , \]

where $t$ is such that $g_t r_\alpha(\mathbf{p}, q)$ is closest to the origin. Indeed, this almost
works except that $t_*$ in (2.7) depends on all the coordinates of $(\mathbf{p}, q)$ as well
as on $\alpha$, not just on $q$. Our goal now is to approximate $t_*$ by a value of $t$
depending only on $q$. By our previous estimates on the Euclidean norm, if $\frac{\mathbf{p}}{q}$
and $\alpha$ satisfy (1.1), then

\[ (n + 1)\, q^2 \varphi(q)^2 > 2q\,(q - r_\alpha(\mathbf{p}, q)_1) . \tag{2.8} \]

Define $t_q = \ln\!\left( \frac{2}{\sqrt{n+1}\,\varphi(q)} \right)$, and then define $\rho(t)$ such that $\frac{2}{\sqrt{n+1}}\, \rho(t_q) = q\varphi(q)$.
This gives rise precisely to the expression (1.5). Clearly, if $\varphi(x)$ is defined on
$[x_0, \infty)$, then $\rho(t)$ is defined on $[t_0, \infty)$, where $t_0$ is given by (1.4).
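To make the definition of $\rho$ concrete, the sketch below evaluates it for two sample choices of $\varphi$, reading (1.5) as $\rho(t) = e^{-t}\varphi^{-1}\big(2/(\sqrt{n+1}\,e^{t})\big)$. The function names `make_rho` and `t_of_q` and all numerical parameters are illustrative assumptions. It checks the defining relation $\frac{2}{\sqrt{n+1}}\rho(t_q) = q\varphi(q)$, and that $\varphi = \frac{\varepsilon}{\sqrt{n+1}}\varphi_1$ yields the constant target $\rho \equiv \varepsilon/2$.

```python
import math

def make_rho(phi_inverse, n):
    """Return rho(t) = e^{-t} * phi^{-1}(2 / (sqrt(n+1) e^t)), read off from (1.5)."""
    def rho(t):
        return math.exp(-t) * phi_inverse(2.0 / (math.sqrt(n + 1) * math.exp(t)))
    return rho

def t_of_q(q, phi, n):
    """The time t_q = ln(2 / (sqrt(n+1) phi(q))) attached to the denominator q."""
    return math.log(2.0 / (math.sqrt(n + 1) * phi(q)))

n = 2

# Example 1: phi(x) = c * x^(-tau) with tau >= 1, so that x * phi(x) is non-increasing.
c, tau = 0.5, 1.5
phi = lambda x: c * x ** (-tau)
phi_inverse = lambda y: (c / y) ** (1.0 / tau)
rho = make_rho(phi_inverse, n)
q = 7.0
lhs = 2.0 / math.sqrt(n + 1) * rho(t_of_q(q, phi, n))
rhs = q * phi(q)

# Example 2: phi = (eps / sqrt(n+1)) * phi_1, which is its own inverse.
eps = 0.3
phi_const = lambda x: eps / (math.sqrt(n + 1) * x)
rho_const = make_rho(phi_const, n)
values = [rho_const(t) for t in (0.0, 1.0, 5.0)]     # each equals eps / 2
```

The second example is the case $\varphi = \frac{\varepsilon}{\sqrt{n+1}}\varphi_1$, where the shrinking target does not shrink at all; this is what links badly approximable points to bounded trajectories.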

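We can now prove Theorem 1.4.

Proof. Let us first address the case of $\alpha \in \mathbb{Q}^{n+1} \cap S^n$, say $\alpha = \frac{\mathbf{p}}{q}$. As mentioned
before, $\frac{\mathbf{p}}{q}$ is $\varphi$-approximable in $S^n$ for any positive function $\varphi$, so it remains to
show that there is an unbounded sequence $t_k$ such that $\omega(g_{t_k} r_{\mathbf{p}/q} \Lambda_0) \le \rho(t_k)$.
In fact, we will show this estimate holds for all $t$ sufficiently large, with $\frac{\mathbf{p}}{q}$ as
its own approximant:

\[
\begin{aligned}
\omega(g_t r_{\mathbf{p}/q} \Lambda_0) &\le \|g_t r_{\mathbf{p}/q}(\mathbf{p}, q)\| = \|g_t (q, 0, \dots, 0, q)\| = q \cdot e^{-t} \\
&\le e^{-t}\, \varphi^{-1}\!\left( \frac{2}{\sqrt{n+1}\, e^{t}} \right) = \rho(t) ,
\end{aligned}
\]

where these inequalities hold whenever $\varphi^{-1}\!\left( \frac{2}{\sqrt{n+1}\, e^{t}} \right) \ge q$, i.e. $t \ge \ln\!\left( \frac{2}{\sqrt{n+1}\,\varphi(q)} \right)$.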



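Now suppose that $\alpha \in S^n$ is irrational and $\varphi$-approximable in $S^n$, and let
$\frac{\mathbf{p}}{q} \in S^n$ satisfy (1.1). We will show that $\|g_{t_q} r_\alpha(\mathbf{p}, q)\| < 2\rho(t_q)$:

\[
\begin{aligned}
\|g_{t_q} r_\alpha(\mathbf{p}, q)\| &= \frac{1}{2} \left( e^{t_q} (q - r_\alpha(\mathbf{p}, q)_1) + e^{-t_q} (q + r_\alpha(\mathbf{p}, q)_1) \right) \\
&< \frac{1}{2} \left( \frac{2}{\sqrt{n+1}\,\varphi(q)} \cdot \frac{(n+1)\, q \varphi(q)^2}{2} + \frac{\sqrt{n+1}\,\varphi(q)}{2} \cdot 2q \right) \\
&= \sqrt{n+1}\, q \varphi(q) \\
&= 2\rho(t_q) .
\end{aligned}
\]

Note that our use of (2.8) is legitimate since our assumption on $\frac{\mathbf{p}}{q}$ and $\alpha$
implies that $\|q \mathbf{e}_1 - r_\alpha(\mathbf{p}, q)\|_e$ is less than $\sqrt{n+1}\, q \varphi(q)$.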

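Conversely, suppose that the lattice $g_t r_\alpha \Lambda_0$ contains a vector of length less
than $\rho(t)$, and let $g_t r_\alpha(\mathbf{p}, q)$ be such a vector. First note that we must have

\[ q < e^{t} \rho(t) \]

(this follows from comparing the norm of $g_t r_\alpha(\mathbf{p}, q)$ with that of $g_t (e^{t} \rho(t) \cdot \mathbf{e}_1)$,
and noting that the norm of $g_t (e^{t} \rho(t) \cdot \mathbf{e}_1)$ is precisely $\rho(t)$). Furthermore, by
Lemma 2.4 (applied with $\delta = \rho(t)$ and $N = e^{t} \rho(t) > q$) we have that

\[ \left\| \alpha - \frac{\mathbf{p}}{q} \right\| < \frac{2\rho(t)}{\sqrt{qN}} < \frac{2\rho(t)}{q} . \]

So it suffices to prove that $\frac{2}{\sqrt{n+1}}\, \rho(t) \le q\varphi(q)$.
Let $s = \varphi^{-1}\!\left( \frac{2}{\sqrt{n+1}\, e^{t}} \right) = e^{t} \rho(t)$, so that $q < s$. Since the function $x \mapsto x\varphi(x)$ is assumed to be
non-increasing, we have

\[ q\varphi(q) \ge s\varphi(s) = \varphi^{-1}\!\left( \frac{2}{\sqrt{n+1}\, e^{t}} \right) \cdot \frac{2}{\sqrt{n+1}\, e^{t}} = \frac{2}{\sqrt{n+1}}\, \rho(t) , \]

so that

\[ \left\| \alpha - \frac{\mathbf{p}}{q} \right\| < \frac{2\rho(t)}{q} \le \sqrt{n+1}\, \varphi(q) . \]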

In fact this last argument shows that the function $\rho(t)$ is non-increasing: let
$t < t'$; we claim that $\rho(t) \ge \rho(t')$. Indeed, $\frac{2}{\sqrt{n+1}\, e^{t}} > \frac{2}{\sqrt{n+1}\, e^{t'}}$, and since $\varphi$ is
decreasing,

\[ s = \varphi^{-1}\!\left( \frac{2}{\sqrt{n+1}\, e^{t}} \right) < \varphi^{-1}\!\left( \frac{2}{\sqrt{n+1}\, e^{t'}} \right) = s' . \]

Since $x \mapsto x\varphi(x)$ is non-increasing, we have $s\varphi(s) \ge s'\varphi(s')$, which immediately
yields $\rho(t) \ge \rho(t')$ as needed.

We observed previously that for every $(\mathbf{p}, q) \in \Lambda_0$, $\|g_t r_\alpha(\mathbf{p}, q)\| \to \infty$ as
$t$ increases, and therefore each $\frac{\mathbf{p}}{q}$ works only for a bounded length of time.
Since the sequence $t_k$ is unbounded, there must be infinitely many distinct
approximants, i.e. $\alpha \in A(\sqrt{n+1}\,\varphi, S^n)$. □

3. Reduction theory 

For the proof of our main theorems we also need some background in reduction
theory for $G/\Gamma$. We will use a rough fundamental domain for the action
of $\Gamma$ on $G$ in terms of the Iwasawa decomposition (Theorem 2.1). For $\tau \in \mathbb{R}_+$,
let $A_\tau := \{ g_s : s \ge -\ln(\tau) \}$, and define a Siegel set to be a set of the form

\[ \mathfrak{S}_{\tau, M} = K A_\tau M , \]

where $A_\tau$ is as above and $M \subset U$ is relatively compact.

The following theorem shows that finitely many translates of some Siegel
set give a rough fundamental domain for $\Gamma$:

Theorem 3.1 ([3], §13; see also [17], Proposition 2.2). There exists a Siegel
set $\mathfrak{S} = \mathfrak{S}_{\tau, M}$ and a finite set $F = \{ f_1, \dots, f_m \} \subset G \cap \mathrm{SL}_{n+2}(\mathbb{Q})$ such that the
union $\Omega := \bigcup_{i=1}^{m} \mathfrak{S} f_i$ satisfies

(1) $G = \Omega \Gamma$;

(2) for any $f \in G \cap \mathrm{SL}_{n+2}(\mathbb{Q})$, the set $\{ \gamma \in \Gamma : \Omega f \cap \Omega \gamma \ne \emptyset \}$ is finite.

Our next goal is to relate the function $\omega$ to a metric on $\mathcal{L}$. Actually, it will
be more convenient to do it through the function $\Delta : \mathcal{L} \to \mathbb{R}$ given by

\[ \Delta(\Lambda) := -\ln \omega(\Lambda) . \tag{3.1} \]

Choose a right-invariant and bi-$K$-invariant metric '$\mathrm{dist}_G$' on $G$, normalized
so that $\mathrm{dist}_G(g_s, g_t) = |s - t|$. Also denote by 'dist' the induced metric on
$\mathcal{L} = G/\Gamma$, namely, define

\[ \mathrm{dist}(g\Lambda_0, h\Lambda_0) := \inf_{\gamma \in \Gamma} \mathrm{dist}_G(g, h\gamma) . \]

Clearly one has $\mathrm{dist}(g\Lambda_0, h\Lambda_0) \le \mathrm{dist}_G(g, h)$. A partial converse, where $g, h$
are taken from a Siegel set, is known as Siegel's conjecture, proved for $G =
\mathrm{SO}(n + 1, 1)$ by Borel⁴ [4, Theorem C]:

Theorem 3.2. Let $\mathfrak{S}$ and $F$ be as in Theorem 3.1. Then there exists a constant
$D > 0$ such that for each $f \in F$, any $g \in \mathfrak{S} f$ and any $\gamma \in \Gamma$,

\[ \mathrm{dist}_G(e, g) - D \le \mathrm{dist}(g\Lambda_0, \Lambda_0) \le \mathrm{dist}_G(e, g) . \]

Now we can state the desired relationship between $\Delta$ and dist:

4 Borel's proof is known to be incomplete for groups of higher rank, see [17, Remark 5.6];
however since $G$ is of real rank one, it is sufficient for our situation so we cite his result. See
[14, Theorem 7.6] and [17, Theorem 5.7] for the correct proof of the general case.
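Lemma 3.3. $\sup_{g \in G} | \mathrm{dist}(g\Lambda_0, \Lambda_0) - \Delta(g\Lambda_0) | < \infty$.

Proof. We are going to relate both functions in the statement of the lemma to
the $A_\tau$-term of the Siegel decomposition of $g$ given by Theorem 3.1. By the
theorem we have a description

\[ g = k g_s u f_i \gamma \in \Omega \Gamma , \tag{3.2} \]

where $f_i \in G \cap \mathrm{SL}_{n+2}(\mathbb{Q})$, $k \in K$, $s \ge -\ln(\tau)$, $u \in M \subset U$ and $\gamma \in \Gamma$.
We first show that $\omega(g\Lambda_0) \asymp e^{-s}$ (here and hereafter we use notation $A \asymp
B$ if $A \ll B \ll A$). Let $N$ be a common denominator of all the matrix
coefficients of $f_1, \dots, f_m$ and $f_1^{-1}, \dots, f_m^{-1}$. Then $N f_i^{-1} \in \mathrm{GL}_{n+2}(\mathbb{Z})$, therefore
$\mathbf{w} := N \gamma^{-1} f_i^{-1} \mathbf{e}_1 \in \mathbb{Z}^{n+2} \cap L = \Lambda_0$. Since $\mathbf{e}_1$ is fixed by $U$ and contracted by
$g_s$, $s > 0$, we have that

\[
\begin{aligned}
\omega(g\Lambda_0) \le \|g\mathbf{w}\| &= \|k g_s u f_i \gamma (N \gamma^{-1} f_i^{-1} \mathbf{e}_1)\| \\
&= N \|k g_s \mathbf{e}_1\| \\
&= N \|k (e^{-s} \mathbf{e}_1)\| \\
&\ll e^{-s} .
\end{aligned}
\]

To prove the other bound, note that the terms $g_s$ contract scalar multiples
of $\mathbf{e}_1$ faster than any other vectors in $L$, hence for every $\mathbf{v} \in \Lambda_0 \smallsetminus \{0\}$ one has

\[
\begin{aligned}
\|g\mathbf{v}\| = \|k g_s u f_i \gamma \mathbf{v}\| &\gg \|g_s u f_i \gamma \mathbf{v}\| \\
&= \frac{1}{N} \|g_s u (N f_i \gamma \mathbf{v})\| \gg \frac{1}{N} e^{-s} \|u (N f_i \gamma \mathbf{v})\| \\
&\ge \frac{1}{N} e^{-s} \frac{1}{\|u^{-1}\|} \|N f_i \gamma \mathbf{v}\| \ge \frac{1}{N} e^{-s} \frac{1}{\|u^{-1}\|}
\end{aligned}
\]

(the last inequality holds since $N f_i \gamma \mathbf{v} \in \mathbb{Z}^{n+2} \smallsetminus \{0\}$). But $u$ belongs to a
compact subset of $U$, hence $\|u^{-1}\|$ is uniformly bounded from above; thus
$\omega(g\Lambda_0) \gg e^{-s}$, as desired. In other words, $\sup_{g \in G} | \Delta(g\Lambda_0) - s | < \infty$, where $g$
and $s$ are as in (3.2). In view of Theorem 3.2, to prove Lemma 3.3 it remains
to show that

\[ \sup_{f \in F,\ k \in K,\ u \in M,\ s \ge -\ln(\tau)} | \mathrm{dist}_G(k g_s u f, e) - s | < \infty . \]

But this is immediate from the invariance properties of the metric, compactness
of $K$, boundedness of $MF$ and the normalization of $\mathrm{dist}_G$. □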

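A consequence of the above lemma is a compactness criterion for subsets
of $\mathcal{L}$, similar to Mahler's Compactness Criterion for $\mathrm{SL}_n(\mathbb{R})/\mathrm{SL}_n(\mathbb{Z})$ [18]. For
$\varepsilon > 0$, consider

\[ \mathcal{K}_\varepsilon := \{ \Lambda \in \mathcal{L} : \omega(\Lambda) \ge \varepsilon \} = \{ \Lambda \in \mathcal{L} : \Delta(\Lambda) \le \log(1/\varepsilon) \} . \]

Corollary 3.4. A subset $E \subset \mathcal{L}$ is relatively compact if and only if $E \subset \mathcal{K}_\varepsilon$
for some positive $\varepsilon$.

Proof. The 'only if' direction is straightforward by the continuity of $\omega$; for the
other direction, it suffices to show that each $\mathcal{K}_\varepsilon$ is bounded, which is immediate
from Lemma 3.3. □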
Clearly $\mathcal{K}_0 = \mathcal{L}$ and if $\delta > \varepsilon$, then $\mathcal{K}_\delta \subset \mathcal{K}_\varepsilon$; thus $\{ \mathcal{K}_\varepsilon : \varepsilon > 0 \}$ gives a compact
exhaustion of $\mathcal{L}$. This makes it possible to interpret the correspondence
of Theorem 1.4 as a connection between good approximations of $\alpha \in S^n$ by
rational points of $S^n$ and excursions of trajectories $g_t r_\alpha \Lambda_0$ in $\mathcal{L}$ outside of large
compact subsets.

We close the section with another useful corollary:

Corollary 3.5. There exists $C > 0$ such that $\mathcal{K}_C = \emptyset$.

Proof. If no such constant $C$ existed, then for every $k \in \mathbb{N}$ we could find
$\Lambda_k \in \mathcal{L}$ such that $\omega(\Lambda_k) \ge k$. By Corollary 3.4, the collection $\{ \Lambda_k : k \ge 1 \}$
is precompact, and hence has a limit point $\Lambda$. Let $\mathbf{v} \in \Lambda$ be nonzero. By
the topology on $\mathcal{L}$, there exist vectors $\mathbf{v}_k \in \Lambda_k$ such that $\mathbf{v}_k \to \mathbf{v}$. But this
contradicts the fact that $\|\mathbf{v}_k\| \ge k$. □



4. Proof of the main results 

4.1. Dirichlet's Theorem. Our goal for this subsection is to derive Theorem
1.1 from a stronger statement, Theorem 4.1. Let us introduce the following
definition: for a subset $X$ of $\mathbb{R}^{n+1}$ and real numbers $C, a, b$ let us say that
$\alpha \in X$ is

• $(C, a, b)$-uniformly Dirichlet in $X$ if given any $N > 1$,

\[ \exists\, \tfrac{\mathbf{p}}{q} \in X \text{ with } q \le N \text{ such that } \left\| \alpha - \frac{\mathbf{p}}{q} \right\| < \frac{C}{q^a N^b} ; \tag{4.1} \]

• $(C, a, b)$-Dirichlet in $X$ if $\exists\, N_0$ such that (4.1) holds for $N > N_0$.
In [22] it was shown that every $\alpha \in S^n$ is $(C, 0, b)$-uniformly Dirichlet in $S^n$
with

\[ C = 4\sqrt{2}\, \lceil \log_2(n + 1) \rceil \quad \text{and} \quad b = \frac{1}{2 \lceil \log_2(n + 1) \rceil} . \]

Later a systematic study of this property for homogeneous varieties $X$ was
undertaken in [11], where in particular it has been shown that

• every $\alpha \in S^n$ is $(1, 0, b)$-Dirichlet in $S^n$ for any

$b < 1/4$   $n$ even
$b < 1/3$   $n = 3$
^<i + 4^ ™>5 odd 
• almost every $\alpha \in S^n$ is $(1, 0, b)$-Dirichlet in $S^n$, where

$b < 1/2$   $n$ even
$b < 2/3$   $n = 3$


6<| + |^ n>5 odd 



In this section we prove 



Theorem 4.1. There exists a constant $C$ such that every $\alpha \in S^n$ is
$(C, 1/2, 1/2)$-uniformly Dirichlet in $S^n$.

This, in particular, implies being $(1, 0, b)$-Dirichlet for any $b < 1/2$ and
improves on all the aforementioned results valid for every $\alpha$ (although for
odd $n$, [11]'s almost everywhere statements yield a still better approximation).
Also it is clear that, for $b > 0$, $(C, a, b)$-Dirichlet implies $C\varphi_{a+b}$-approximable;
thus Theorem 1.1 immediately follows from Theorem 4.1. Note that our value
for $C$, coming from Corollary 3.5, is not effective - it would be interesting to
get an explicit estimate.



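Proof. Let $C$ be the minimal constant making Corollary 3.5 true; clearly $C \ge 1$
as witnessed by the standard lattice $\Lambda_0$. Fix $\alpha \in S^n$ and let $N \ge C \ge 1$. We
need to find $\frac{p}{q} \in S^n$ such that

$$q \le N \quad\text{and}\quad \left\| \alpha - \frac{p}{q} \right\| < \frac{2C}{\sqrt{qN}}.$$

Let $t = \ln\left(\frac{N}{C}\right) \ge 0$, and consider the lattice $g_t r_\alpha \Lambda_0 \in \mathcal{L}$. By Corollary 3.5,
we have that $\omega(g_t r_\alpha \Lambda_0) \le C$. Let $(p, q) \in \Lambda_0$ be such that $\|g_t r_\alpha (p, q)\| \le C$.
Then $q \le e^{t} C = N$, and by Lemma 2.4 we have that

$$\left\| \alpha - \frac{p}{q} \right\| < \frac{2C}{\sqrt{qN}}$$

as needed.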

It remains to prove the inequality when $1 < N < C$. Let $(p, q) = (1, 0, \dots, 0, 1)$.
Because the diameter of the sphere is 2, for any $\alpha \in S^n$ we have

$$\left\| \alpha - \tfrac{1}{1}(1, 0, \dots, 0) \right\| \le 2 < \frac{2C}{\sqrt{N}}$$

as desired.



□ 
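As a numerical illustration only (it plays no role in the proof), the following Python sketch looks at the circle $S^1$, assuming the case $n = 1$ is covered by Theorem 4.1. It enumerates by brute force the rational points $\frac{p}{q}$ on the circle with $q \le N$ and returns the smallest value of $\|\alpha - \frac{p}{q}\|\sqrt{qN}$ in the supremum norm, the quantity that Theorem 4.1 asserts is bounded independently of $\alpha$ and $N$. The function names and the search strategy are choices made for this sketch; the constant $C$ itself is not computed.

```python
import math

def circle_points(N):
    """All integer triples (a, b, q) with a^2 + b^2 = q^2 and 1 <= q <= N.

    Each triple represents the rational point (a/q, b/q) on the unit circle.
    Non-primitive triples are kept; they only repeat points already listed.
    """
    points = []
    for q in range(1, N + 1):
        for a in range(-q, q + 1):
            b_squared = q * q - a * a
            b = math.isqrt(b_squared)
            if b * b == b_squared:
                points.append((a, b, q))
                if b != 0:
                    points.append((a, -b, q))
    return points

def best_dirichlet(alpha, N):
    """Return (ratio, q, a, b) minimising ratio = ||alpha - (a, b)/q|| * sqrt(q N).

    The norm is the supremum norm; ties are broken by the smaller q.
    """
    x, y = alpha
    return min(
        (max(abs(x - a / q), abs(y - b / q)) * math.sqrt(q * N), q, a, b)
        for a, b, q in circle_points(N)
    )

if __name__ == "__main__":
    alpha = (math.cos(1.0), math.sin(1.0))
    for N in (10, 100, 1000):
        print(N, best_dirichlet(alpha, N))
```

Minimising the ratio rather than the raw distance matters here: the closest point with $q \le N$ may have a large denominator, whereas the theorem only promises some point for which the product with $\sqrt{qN}$ is small.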



4.2. $\mathrm{BA}(S^n)$ and the optimality of Theorem 1.1. We now show that the
function $\phi_1(q) = \frac{1}{q}$ appearing in Theorem 1.1 is optimal in the sense that
for any faster decaying function $\psi$, there are points in $S^n$ which are not
$\psi$-approximable. Specifically, any badly approximable point will fail to be
$\psi$-approximable. So to demonstrate the optimality of $\phi_1$, it suffices to show that
$\mathrm{BA}(S^n)$ is nonempty. Indeed, we will show more, namely that this set is thick.
The key ingredient here is a dynamical interpretation of the set $\mathrm{BA}(S^n)$. It
will be convenient to denote

$$B := \{ g \in G : \{ g_t g \Lambda_0 : t \ge 0 \} \text{ is bounded in } \mathcal{L} \}. \tag{4.2}$$

Proposition 4.2. $\alpha \in \mathrm{BA}(S^n)$ if and only if $r_\alpha \in B$.

Proof. First suppose that $\alpha \in \mathrm{BA}(S^n)$, i.e. there exists $\varepsilon > 0$ such that $\alpha$ is
not in $A(\varepsilon\phi_1, S^n)$. Applying Theorem 1.4 with the function $\phi := \frac{\varepsilon}{\sqrt{n+1}}\phi_1$, we
have that for all $t$ sufficiently large,

$$\omega(g_t r_\alpha \Lambda_0) \ge \rho(t),$$

where $\rho(t)$ is given by (1.5) (note in this case $\phi$ is its own inverse):

$$\rho(t) = e^{-t}\phi^{-1}\!\left( \frac{2}{\sqrt{n+1}}\, e^{-t} \right) = e^{-t} \cdot \frac{\varepsilon}{\sqrt{n+1}} \cdot \frac{\sqrt{n+1}\, e^{t}}{2} = \frac{\varepsilon}{2},$$

independent of $t$. But this, by Corollary 3.4, says precisely that the orbit
$\{ g_t r_\alpha \Lambda_0 : t \ge 0 \}$ is bounded in $\mathcal{L}$.

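Conversely, suppose that $r_\alpha \in B$. By Corollary 3.4, this is equivalent to the
existence of $c > 0$ such that $\omega(g_t r_\alpha \Lambda_0) \ge c$ for every $t \ge 0$. Let $\phi := \frac{c}{\sqrt{n+1}}\phi_1$;
then, similarly to the above computation, $\rho(t) = c/2$, therefore

$$\omega(g_t r_\alpha \Lambda_0) \ge 2\rho(t) = c \quad\text{for all } t \ge 0.$$

By Theorem 1.4, $\alpha$ is not contained in $A\!\left( \frac{c}{\sqrt{n+1}}\phi_1, S^n \right)$, i.e. $\alpha \in \mathrm{BA}(S^n)$. □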
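The fact that $\rho$ is constant for multiples of $\phi_1$ can be checked numerically. The Python sketch below assumes that (1.5) has the form $\rho(t) = e^{-t}\phi^{-1}\big(\tfrac{2}{\sqrt{n+1}}e^{-t}\big)$, which is how that formula reads in this proof; the helper names are hypothetical and chosen for this illustration only.

```python
import math

def rho(t, phi_inverse, n):
    """Assumed form of (1.5): rho(t) = e^{-t} * phi^{-1}(2 e^{-t} / sqrt(n + 1))."""
    return math.exp(-t) * phi_inverse(2.0 * math.exp(-t) / math.sqrt(n + 1))

def scaled_phi_1_inverse(eps, n):
    """Inverse of phi(x) = eps / (sqrt(n + 1) * x); this phi is its own inverse."""
    return lambda y: eps / (math.sqrt(n + 1) * y)

if __name__ == "__main__":
    n, eps = 3, 0.25
    for t in (0.0, 1.0, 5.0, 20.0):
        # Each printed value should be close to eps / 2, independently of t.
        print(t, rho(t, scaled_phi_1_inverse(eps, n), n))
```

Under the stated assumption, the two factors $e^{-t}$ and $e^{t}$ cancel, which is why the orbit bound in the proof does not depend on $t$.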

Now recall the following theorem of Dani [7]:

Theorem 4.3. The set $H \cap B$ is thick in $H$.

Theorem 1.2 will follow from showing the set $W \cap \mathrm{BA}(S^n)$ to be bi-Lipschitz
to a neighborhood in $H \cap B$.

Proof of Theorem 1.2. Let $\widetilde{W} \subset H$ be the image of $W$ under the mapping
$\alpha \mapsto h_\alpha$ discussed in Lemma 2.2. Since Hausdorff dimension is preserved
by bi-Lipschitz mappings, it remains to show that $W \cap \mathrm{BA}(S^n)$ is mapped
bijectively to $\widetilde{W} \cap B$.

By Proposition 4.2, we know that $\alpha \in \mathrm{BA}(S^n)$ if and only if $r_\alpha \in B$, i.e.

$$\{ g_t r_\alpha \Lambda_0 : t \ge 0 \} \text{ is bounded.}$$

But

$$g_t h_\alpha \Lambda_0 = g_t h_\alpha r_\alpha^{-1} r_\alpha \Lambda_0 = (g_t h_\alpha r_\alpha^{-1} g_t^{-1})\, g_t r_\alpha \Lambda_0$$

is at a uniformly bounded distance from $g_t r_\alpha \Lambda_0$, since, by Lemma 2.2, $h_\alpha r_\alpha^{-1}$ is
an element of $UH^0$, the product of the neutral and contracting horospherical
subgroups corresponding to $\{ g_t : t \ge 0 \}$. Thus $r_\alpha \in B$ if and only if $h_\alpha \in B$,
i.e.

$$h : W \cap \mathrm{BA}(S^n) \to \widetilde{W} \cap B$$

is a bijection, as needed. □

Note that Dani proves this by establishing a stronger property: winning
in the sense of Schmidt [20]. This has been recently strengthened by McMullen
to so-called absolute winning, see [19] for details. Both winning and
absolute winning properties are preserved by bi-Lipschitz mappings. Consequently,
Theorem 1.2 can be strengthened to an assertion that the set $\mathrm{BA}(S^n)$
is absolutely winning.

4.3. Khintchine's Theorem. Our last goal is to prove the divergence case of
Theorem 1.3. Recall that we are given $\phi : \mathbb{N} \to (0, \infty)$ such that the function
$k \mapsto k\phi(k)$ is non-increasing and the series (1.2) diverges. Since $\phi$ is decreasing,
we may extend its domain from $\mathbb{N}$ to $[1, \infty)$ such that it is piecewise $C^1$ and
the function $x \mapsto x\phi(x)$ is still non-increasing. In view of Theorem 1.4, and
replacing $\phi$ by $\frac{1}{\sqrt{n+1}}\phi$, to prove Theorem 1.3 it suffices to show that

$$\text{for a.e. } \alpha \in S^n \ \exists \text{ a sequence } t_k \to \infty \text{ such that } \omega(g_{t_k} r_\alpha \Lambda_0) < \rho(t_k), \tag{4.3}$$

where $\rho : (t_0, \infty) \to (0, \infty)$ is associated to $\phi(\cdot)$ as in Theorem 1.4. This will be
a consequence of the following theorem - a dynamical Borel-Cantelli Lemma
describing the $g_t$-action on the probability space $(\mathcal{L}, \mu)$:
Theorem 4.4. For any function $\rho : \mathbb{N} \to (0, \infty)$,

$$\mu(\{ \Lambda \in \mathcal{L} : \omega(g_t \Lambda) < \rho(t) \text{ for infinitely many } t \in \mathbb{N} \}) = \begin{cases} 1 \\ 0 \end{cases} \tag{4.4}$$

according to the divergence or convergence, respectively, of the sum

$$\sum_{t=1}^{\infty} \rho(t)^n. \tag{4.5}$$
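To make the dichotomy concrete, the following Python sketch compares partial sums of (4.5) for two hypothetical choices of $\rho$ that are not taken from the paper: $\rho(t) = t^{-1/n}$, for which $\rho(t)^n = 1/t$ and the sum diverges, and $\rho(t) = t^{-2/n}$, for which $\rho(t)^n = 1/t^2$ and the sum converges. It only illustrates the convergence condition, not the measure statement itself.

```python
def partial_sum(rho, n, T):
    """Partial sum of rho(t)^n over t = 1, ..., T, as in (4.5)."""
    return sum(rho(t) ** n for t in range(1, T + 1))

if __name__ == "__main__":
    n = 3
    divergent = lambda t: t ** (-1.0 / n)    # rho(t)^n = 1/t, harmonic series
    convergent = lambda t: t ** (-2.0 / n)   # rho(t)^n = 1/t^2
    for T in (10, 1000, 100000):
        print(T, partial_sum(divergent, n, T), partial_sum(convergent, n, T))
```

In the first case Theorem 4.4 gives measure 1 for the set in (4.4), in the second case measure 0.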

Proof of Theorem 1.3 assuming Theorem 4.4. First let us have a lemma connecting
(1.2) to (4.5):

Lemma 4.5. Let $\phi(\cdot)$ and $\rho(\cdot)$ be related via (1.5). Then

$$\int_{t_0}^{\infty} \rho(t)^n \, dt < \infty \iff \int_{x_0}^{\infty} x^{n-1} \phi(x)^n \, dx < \infty. \tag{4.6}$$



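Proof. Using (1.5), one can rewrite the first integral in (4.6) as

$$\int_{t_0}^{\infty} \left( e^{-t} \phi^{-1}\!\left( \frac{2}{\sqrt{n+1}}\, e^{-t} \right) \right)^{n} dt.$$

After a change of variable $x = \phi^{-1}\!\left( \frac{2}{\sqrt{n+1}}\, e^{-t} \right)$, the previous integral becomes
equal to $\left( \frac{\sqrt{n+1}}{2} \right)^{n}$ times

$$\int_{x_0}^{\infty} \phi(x)^n x^n \left( -\frac{\phi'(x)}{\phi(x)} \right) dx = -\int_{x_0}^{\infty} x^n \phi(x)^{n-1} \phi'(x) \, dx,$$

which, after integration by parts, can be written as

$$\int_{x_0}^{\infty} x^{n-1} \phi(x)^n \, dx + \frac{1}{n} x_0^n \phi(x_0)^n - \lim_{x \to \infty} \frac{1}{n} x^n \phi(x)^n.$$

But since the function $x\phi(x)$ is non-increasing, the last term above is finite,
and thus the two integrals in (4.6) converge or diverge simultaneously. □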
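The integration by parts step can be sanity-checked numerically on a finite interval $[x_0, X]$. The Python sketch below uses a composite Simpson rule and the hypothetical test function $\phi(x) = x^{-3/2}$ (for which $x\phi(x)$ is non-increasing); both choices are made for this illustration and are not taken from the paper.

```python
def simpson(f, a, b, steps=20000):
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def both_sides(phi, dphi, n, x0, X):
    """Left and right sides of the integration-by-parts identity on [x0, X]."""
    left = -simpson(lambda x: x**n * phi(x)**(n - 1) * dphi(x), x0, X)
    right = (simpson(lambda x: x**(n - 1) * phi(x)**n, x0, X)
             + x0**n * phi(x0)**n / n
             - X**n * phi(X)**n / n)
    return left, right

if __name__ == "__main__":
    # Hypothetical test function phi(x) = x^{-3/2} with derivative -1.5 x^{-5/2}.
    phi = lambda x: x**-1.5
    dphi = lambda x: -1.5 * x**-2.5
    print(both_sides(phi, dphi, n=2, x0=1.0, X=50.0))
```

For this choice with $n = 2$ both sides equal $1.5\,(1 - 1/X)$ in closed form, so the two printed numbers should agree up to the quadrature error.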

Now back to the proof of Theorem 1.3. As before, without loss of generality
we can restrict our attention to $\alpha \in W$. Suppose that (4.3) fails, that is, there
exists a subset $W_0$ of $W$ of positive measure consisting of $\alpha$ such that

$$\forall \alpha \in W_0, \quad \omega(g_t r_\alpha \Lambda_0) \ge \rho(t) \text{ for large enough } t \in \mathbb{R}. \tag{4.7}$$

Now take a small neighborhood $B$ of identity in $UH^0$, recall the map $\alpha \mapsto h_\alpha$
from Lemma 2.2, and write, for $g \in B$,

$$g_t g h_\alpha \Lambda_0 = g_t (g h_\alpha r_\alpha^{-1}) g_{-t}\, g_t r_\alpha \Lambda_0.$$

In view of (2.3), we have that $g h_\alpha r_\alpha^{-1}$ is contained in $UH^0$, and moreover, in a
fixed (dependent on $B$ and $W_0$) subset of $UH^0$. Arguing as in the proof of
Theorem 1.2, we see that there exists a compact subset $M$ of $G$ such that for
any $\alpha \in W_0$ and $g \in B$, one has $g_t g h_\alpha \Lambda_0 = g' g_t r_\alpha \Lambda_0$ for some $g' \in M$. This
and (4.7) imply the existence of a constant $c > 0$ such that

$$\forall \alpha \in W_0 \text{ and } g \in B, \quad \omega(g_t g h_\alpha \Lambda_0) \ge c\rho(t) \text{ for large enough } t \in \mathbb{R}.$$

But since the product map $U \times H^0 \times H \to G$ is a local diffeomorphism and
the map $\alpha \mapsto h_\alpha$ is bi-Lipschitz, we can conclude, by Fubini's Theorem, that
the Haar measure of $g \in G$ such that $\omega(g_t g \Lambda_0) \ge c\rho(t)$ for large enough $t$ is
positive. Therefore the set in (4.4), with $\rho$ replaced by $c\rho$ and extended to
$\mathbb{N} \cap [1, t_0]$ in an arbitrary way, does not have full measure. By Theorem 4.4,
the sum (4.5) converges, and by the monotonicity of $\rho$, so does the integral
$\int_{t_0}^{\infty} \rho(t)^n \, dt$. Thus, by Lemma 4.5 and the regularity of $\phi$, the sum (1.2) also
converges, contradicting our assumption. □

Now let us turn to Theorem 4.4. Its convergence part (which we do not
need for the proof of Theorem 1.3) is a straightforward consequence of the
Borel-Cantelli Lemma and the following fact:

Lemma 4.6. For all $\varepsilon > 0$ one has

$$\mu(\mathcal{L} \setminus \mathcal{K}_\varepsilon) = \mu(\{ \Lambda \in \mathcal{L} : \omega(\Lambda) < \varepsilon \}) \asymp \varepsilon^n.$$

As for the divergence part, one needs to verify certain quasi-independence
conditions on the $g_t$-preimages of sets $\mathcal{L} \setminus \mathcal{K}_\varepsilon$. Such methods date back to
the work of Sullivan [23] and Kleinbock-Margulis [16]; in fact we are going to
derive Theorem 4.4 from one of the main results of [16].



Proof of Theorem 4.4. From the continuity of the $G$-action on $\mathcal{L}$ it follows
that the function $\Delta$ defined in (3.1) is uniformly continuous, and Lemma 4.6
amounts to saying that

$$\mu(\{ \Lambda \in \mathcal{L} : \Delta(\Lambda) \ge z \}) \asymp e^{-nz}.$$

In other words, in the terminology of [16], $\Delta$ is $n$-DL. Thus [16, Theorem 1.7]
applies, and one can conclude that the family of super-level sets of $\Delta$,

$$\{ \{ \Lambda \in \mathcal{L} : \Delta(\Lambda) \ge z \} : z \in \mathbb{R} \},$$

is Borel-Cantelli for $g_1$. The latter by definition means that for any sequence
$\{ E_t : t \in \mathbb{N} \}$ of sets from the above family one has

$$\mu(\{ \Lambda \in \mathcal{L} : g_t(\Lambda) \in E_t \text{ for infinitely many } t \in \mathbb{N} \}) = \begin{cases} 0 & \text{if } \sum_{t=1}^{\infty} \mu(E_t) < \infty \\ 1 & \text{otherwise} \end{cases}$$

which is precisely the conclusion of Theorem 4.4 in view of Lemma 4.6. □

It remains to write down the 

Proof of Lemma 4.6. In light of Theorem 3.1 and Lemma 3.3, to prove Lemma
4.6 it suffices to show that for $\tau$ and $M$ as in Theorem 3.1 one has

$$\mu(\{ g \in \mathfrak{S}_{\tau,M} : \mathrm{dist}_G(g, e) \ge z \}) \asymp e^{-nz}. \tag{4.8}$$

We remark that it follows from [16, Lemma 5.6] that (4.8) holds with some
explicitly computable $k$ in place of $n$. However, for completeness, we give the
proof here. Consider the projection $G = K \times A \times U \to A$. The Haar measure
on $G$ is pushed forward by this mapping to a measure proportional to $\delta(a)\,da$,
where $da$ is the Lebesgue measure on $A$ and $\delta(a)$ is the modulus of conjugation
by $a$ on $U$. Explicitly, we can compute $\delta(a)$ as follows: the Lie algebra $\mathfrak{u}$ of $U$
can be described in coordinates as a family of $(n+2) \times (n+2)$ block matrices whose
nonzero blocks are $\pm x$ and $\pm x^T$, parametrized by $x \in \mathbb{R}^n$.

Clearly $\dim(\mathfrak{u}) = n$. Conjugation by $g_s$ acts on $\mathfrak{u}$ as scalar multiplication by
$e^{-s}$, hence $\delta(g_s)$, which is the determinant of the map $\mathrm{Ad}(g_s) : \mathfrak{u} \to \mathfrak{u}$, equals
$e^{-ns}$. By the discussion of [16, §5], it suffices to show that $\mathrm{dist}_G$ on $A$ satisfies

$$\int_{\{ g_s \,:\, \mathrm{dist}_G(g_s, e) \ge z \}} e^{-ns} \, ds \asymp e^{-nz}.$$

But since $\mathrm{dist}_G(g_s, e) = s$, this immediately follows. □
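For completeness, the final step is the elementary computation below, written under the proof's identification $\mathrm{dist}_G(g_s, e) = s$, so that the domain of integration is $\{ s \ge z \}$:

$$\int_{z}^{\infty} e^{-ns} \, ds = \left[ -\frac{1}{n} e^{-ns} \right]_{s=z}^{s=\infty} = \frac{1}{n} e^{-nz} \asymp e^{-nz}.$$

The implied constant $\frac{1}{n}$ depends only on the dimension, which is all that the relation $\asymp$ requires.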